CS 345 - Project Six
Clusters vs. Sectors
A sector is a fixed division of a track on a disk. Sectors are fixed-size (typically 512 bytes on most media) and are the basic unit of access for any disk. A cluster is one or more contiguous sectors and the fundamental unit for DOS files. The DOS file system only works in terms of clusters. The FAT tables index “used” and “unused” clusters.
For FAT-12, a sector and a cluster are the same size, namely 512 bytes. The cluster size becomes the minimum file size for that system. The distinction between FAT-12, FAT-16, and FAT-32 is merely the size of one entry in the FAT table. FAT-12 systems contain 12 bits per entry. (FAT-32 systems have 32 bits per entry.) The FAT type is solely determined by the number of clusters on the media.
DOS disk layout
An MS-DOS floppy disk layout (FAT-12) consists of four major sections: the boot sector, FAT tables, root directory, and data area:
BS |
FAT
Tables |
Root
Directory (14
sectors ´
16 entries/sector = 224 entries) |
File
Clusters 2 - 2849 |
||||||||||||||||||||||||||||||
FAT
1 |
FAT
2 |
Data
Area … |
|||||||||||||||||||||||||||||||
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |
31 |
32 |
33
- 2880 |
Programming Resources:
Mapping disk data directly to structures
In many instances in Project 6, you may desire to copy a set amount of bytes from the RAM disk, and transfer them directly into a structure that has the appropriate fields already defined. For example, the boot sector of the disk contains raw data that can easily be transferred into an appropriate structure. Another example is one of the 32-byte directory entries in a sector. For memory transfers where you wish to align the bits inside of a byte into structured fields, a bit field is appropriate. However, for structures that are aligned on unsigned char, unsigned short, or unsigned long boundaries, you can simply copy the memory via memcpy into the appropriate structure.
There is one catch. In order to map the disk data to a structure you must make sure that the structure is byte aligned in memory. When a compiler allocates storage space for items in a structure, it may allocate extra space in between the item elements (padding). Disks don't do that. Two directives will force the compiler's alignment of your structures to be aligned in memory.
#pragma pack(push,1) // BYTE align in memory (no padding)
// data structure goes here…
#pragma pack(pop) // End of strict alignment
BSStruct bootSector; // The
BYTE-aligned structure.
memcpy(&bootSector, &RAMDisk[0], sizeof(BSStruct))
The Boot Sector
The first sector on the volume or disk is called the boot sector and it contains information about the rest of the file system structure. You may assume that all disks and image files in this lab will conform to the FAT-12 standard. The boot sector contains the following values, which are aligned exactly as shown.
#pragma pack(push, 1) /* Byte align in memory (no padding) */
typedef struct
{ unsigned char BS_jmpBoot[3]; /* Jump instruction to the boot code */
unsigned char BS_OEMName[8]; /* Name of system that formatted the volume */
unsigned short BPB_BytsPerSec; /* Bytes per sector (should be 512) */
unsigned char BPB_SecPerClus; /* Sectors per cluster (FAT-12
= 1) */
unsigned short BPB_RsvdSecCnt; /* Reserved sectors (FAT-12 = 1) */
unsigned char BPB_NumFATs; /* FAT
tables on the disk (should be 2) */
unsigned short BPB_RootEntCnt; /* Max directory entries in root directory */
unsigned short BPB_TotSec16; /* FAT-12 total number of sectors on the disk */
unsigned char BPB_Media; /*
Media type {fixed, removable, etc.} */
unsigned short BPB_FATSz16; /* Sector size of FAT table (FAT-12 = 9) */
unsigned short BPB_SecPerTrk; /* # of sectors per cylindrical track */
unsigned short BPB_NumHeads; /* # of heads per volume (1.4Mb 3.5" = 2) */
unsigned long BPB_HiddSec; /* # of
preceding hidden sectors (0) */
unsigned long BPB_TotSec32; /* # of FAT-32
sectors (0 for FAT-12) */
unsigned char BS_DrvNum; /*
A drive number for the media (OS specific) */
unsigned char BS_Reserved1; /* Reserved space
for Windows NT (set to 0) */
unsigned char BS_BootSig; /*
(0x29) Indicates following: */
unsigned long BS_VolID; /* Volume serial # (for tracking this disk) */
unsigned char BS_VolLab[11]; /* Volume label (RDL or
"NO NAME ") */
unsigned char BS_FilSysType[8]; /* Deceptive FAT type Label */
} BSStruct;
#pragma pack(pop) /* End strict alignment */
Organization of the file allocation tables (FAT)
The FAT-12 tables are organized as a
linear array of FAT entries. A FAT entry is a single 12 bit value whose
index matches the associated cluster in the data area of the disk. Because the
first cluster in the data area of the disk corresponds to cluster 2, the first
two FAT entries in the FAT table are reserved. You can think of the FAT
table as a large array that contains pointers into the data area of the disk.
It is also a linked-list of sorts, because you retrieve files that are longer
than one cluster by referencing the FAT table, which points you to the next
cluster of that file or directory.
A 12-bit FAT entry may have the following values
Cluster chains are a series of FAT entries that point to the next cluster in the file/directory, and are terminated by an EOC indicator, exactly like linked-lists. For example, in the graphic above, the cluster chain for File1.txt is retrieved by starting at the first cluster indicated in the directory for that file (the bottom box in the above graphic) and following the pointers in the FAT table until you encounter the EOC terminator. The chain would comprise clusters 2, 4, 6, and 7. The cluster chain for MyDir is 3, 5. You know that you've reached the end of a file or directory listing when you check the corresponding FAT entry for the current cluster, and find that it contains the EOC flag.
When you are modifying and updating the FAT table, make sure that you update both copies of the FAT table! Command chkdsk should validate a disk volume by comparing the FAT tables to make sure that they are identical. Any discrepancies will cause a chkdsk error.
There are 3072 FAT entries in each FAT table (512 bytes per sector * 9 sectors = 4608 bytes. 4608 bytes / 1.5 bytes per FAT entry = 3072 FAT entries). Be aware that this is a few more entries than the disk has clusters; therefore, maximum capacity of the disk should be determined by the total number of clusters, not the number of possible FAT table entries.
Programming Resources:
Tracking free space
In the MS-DOS FAT-12 file system, there is no structure or system that keeps track of how much free space is available on the disk. Thus, in order to manage this information yourself, you have several choices.
A file name mask is used to selectively qualify directory entries. A wildcard is a special character that can fill in for missing letter(s) in file and directory names when used to add specificity to OS commands. Although there are various ways to define the syntax and semantics of a mask string, we will use the following rules:
Examples: *.* All files
B* All files beginning with ‘B’ and no extension
???*.C All files of length at least 3 with ‘C’ extension
*.TXT All files with “TXT” extension
FAT-12 file name and
extension representation
File names in DOS traditionally have a limit of 8 characters for the name, and 3 characters for the extension. (Modern FAT file systems allow for longer file names. See Microsoft's White Pages--the good part for details.) In your File Management System, you will need to translate the file names that you are given, into a form that you can use to search directory entries. Here are a few things to be aware of:
Here are examples of how some file names would translate into the 11 bytes allocated for the file/directory name and extension in the directory entry (white space between quotes should be considered as spaces).
filename
[01234567012]
"foo.bar" Þ
"FOO BAR"
"FOO.BAR" Þ
"FOO BAR"
"Foo.Bar" Þ
"FOO BAR"
"foo" Þ
"FOO "
"foo." Þ
"FOO "
"PICKLE.A" Þ
"PICKLE A "
"prettybg.big" Þ
"PRETTYBGBIG"
".big" Þ illegal!
file/directory names cannot begin with a "."
Directory structures and
their fields
A directory entry contains all of the information necessary to reference the contents of a file or directory on the volume, including additional information about the file/directory's attributes and other information including: time of creation/modification, date created, and size of the file (in bytes). All of this information is packed into a small 32-byte structure. Directories with more than 16 entries require more than one sector (maximum entries per sector = 512 bytes per sector / 32 bytes per directory entry = 16 directory entries).
Directories in the data area of the disk can be of any length (number of sectors). However, because the root directory is of fixed size, only a set number of directory entries will fit.
Subdirectories, directories other than the root directory, also contain two directory entries by default. These are the dot [.] and dotdot [..] entries--you have probably seen these before. These entries are simply pointers, the first [dot] points to the current directory, kind of like the "this" pointer in C++. The second entry [dotdot] is a pointer to the parent directory. This is the directory that you are specifying with the command "CD ..". A directory that includes these two entries is still considered empty.
A directory entry contains the following fields:
#pragma pack(push, 1) /* Byte align in memory (no padding) */
typedef struct
{ unsigned char Name[8]; /* File name (capital letters, padded w/spaces) */
unsigned char Extension[3]; /* Extension (same format as name,
no '.') */
unsigned char Attributes; /* Holds the attributes
code */
unsigned char Reserved[10]; /* Reserved for Windows NT (Set
to zero!) */
unsigned short Time; /* Time of last write */
unsigned short Date; /* Date of last write */
unsigned short startCluster; /* Pointer to the first cluster of the file */
unsigned long fileSize; /* File size in bytes */
} DirEntry;
#pragma pack(pop) /* End strict alignment */
The file/directory's attributes can have the following values. Consider using #define for these within your FMS.
#define N_FILE 0x00
#define READ_ONLY 0x01
#define HIDDEN 0x02
#define SYSTEM 0x04
#define VOLUME 0x08 // this is the volume
label entry
#define DIRECTORY 0x10
#define ARCHIVE 0x20 // same as file
For example, a file with read-only,
hidden, system, and archive attributes set might be created by using the
bitwise OR operator.
unsigned short attributes = 0x00;
attributes = READ_ONLY | HIDDEN | SYSTEM | ARCHIVE; //
0x27
The directory and archive attributes are mutually exclusive, that is, a file cannot be both a directory and archived at the same time. The only way to distinguish a file from a directory is by checking the attribute.
For subdirectories that refer back to the root directory, the startCluster value will be 0. Note that cluster 0 is reserved in the FAT table--attempting to look up that value should result in an error. You must provide a check that will re-direct a startCluster of zero to the root directory.
When creating directory entries for new files that have nothing in them (a fileSize of 0), the startCluster should always be set to 0; otherwise chkdsk should report an error. Just remember to change the startCluster to the correct cluster after you have written to the file.
How DOS deletes
files/directories
DOS does not immediately erase a file or directory when it is deleted. In fact, it does nothing to the clusters that contain the information (this is why it is sometimes possible to un-delete something). However, DOS does zero out the file/directory's cluster chain from the FAT table and places a special character (0xe5) in the first byte of the directory entry signaling that this entry has been deleted. You will need to do the same when a file or directory is deleted. Start with the cluster indicated in the directory entry, traverse the cluster chain, and set each FAT entry to zero including the EOC entry. When reading directory entries, ignore all entries that begin with 0xe5.
Programming Resources: Retrieving 12-bit entries from the FAT table
The FAT-12 table consists of an array of 12-bit entries that correspond to clusters in the data area of the disk. Retrieving a single FAT entry is slightly complicated because we cannot easily create a data structure of 12-bit or 1½ unsigned char entries. You may store the FAT table any way you wish, but the following code sample assumes that you've chosen to store the FAT table in an array of unsigned chars. Because we chose this, we will need to grab an entire unsigned short that covers the 12-bit FAT entry, with four bits to spare, and then extract just the 12 bits that we want (either the high-order or low-order bits, depending on whether the original FAT index was even or odd). The functions for retrieving and setting a FAT entry are provided below.
//
Return an unsigned short containing the 12-bit FAT entry code at FATindex
unsigned short GetFatEntry(int
FATindex)
{
unsigned short
FATEntryCode; // The return value
int FatOffset = ((FATindex * 3) / 2); // Calculate the offset
if (FATindex % 2 ==
1) // If the index is odd
{ //
Pull out a unsigned short from a unsigned char array
FATEntryCode
= *((unsigned short *) &FAT[FatOffset]);
FATEntryCode >>= 4; // Extract the high-order 12 bits
}
else // If the index is even
{ //
Pull out a unsigned short from a unsigned char array
FATEntryCode
= *((unsigned short *) &FAT[FatOffset]);
FATEntryCode
&= 0x0fff; // Extract the low-order 12 bits
}
return FATEntryCode;
} //
End GetFatEntry
void
SetFatEntry(int FATindex, unsigned short FAT12ClusEntryVal)
{
int
FATOffset = ((FATindex * 3) / 2); // Calculate the offset
int FATData =
*((unsigned short*)&FAT[FATOffset]);
if
(FATindex % 2 == 0) // If the index is even
{ FAT12ClusEntryVal
&= 0x0FFF; // mask to 12 bits
FATData
&= 0xF000; // mask complement
}
else // Index is odd
{ FAT12ClusEntryVal <<= 4; //
move 12-bits high
FATData
&= 0x000F; // mask complement
}
// Update FAT
entry value in the FAT table
*((unsigned short *)&FAT[FATOffset]) = FATData |
FAT12ClusEntryVal;
} //
End SetFatEntry
Programming Resources: Formatting and printing directory entries
According to the MS-DOS white pages, a directory entry whose first byte (the first byte of the file name) starts with 0xe5 is considered a deleted entry, and should not be printed. Also, an entry whose first byte starts with 0x00 indicates that this directory entry is empty and that none of the directory entries following that entry are valid.
The following function example takes a directory entry structure, formats a directory listing, and prints the resulting string. All these structures need to be BYTE aligned with #pragma directive.
#pragma
pack(push,1) // BYTE align in memory (no padding)
typedef
struct
{ // (total 16 bits--a unsigned short)
unsigned short sec: 5; // low-order 5 bits are the seconds
unsigned short min: 6; // next 6 bits are the minutes
unsigned short hour: 5; // high-order 5 bits are the hour
}
FATTime;
#pragma
pack(pop) // End of strict alignment
#pragma
pack(push,1) // BYTE align in memory (no padding)
typedef struct
{ // (total 16 bits--a unsigned short)
unsigned short day: 5; // low-order 5 bits are the day
unsigned short month: 4; // next 4 bits are the month
unsigned short year: 7; // high-order 7 bits are the year
} FATDate;
#pragma pack(pop) // End of strict alignment
void
PrintDirectoryEntry(DirEntry dirent)
{
int i = 7;
char p_string[64] = "
----- 00:00:00
03/01/2004";
char tempstring[10];
FATDate date; // The Date bit field structure
FATTime time; // The Time bit field structure
strncpy(p_string,
(char*)&dirent.Name, 8); // Copies 8 bytes from the name
while (p_string[i] == ' ') i--;
p_string[i+1] = '.'; // Add extension
strncpy(&p_string[i+2],
(char*)&dirent.Extension, 3);
while (p_string[i+4] == ' ') i--;
if (p_string[i+4] == '.') p_string[i+4] = ' ';
//
Generate the attributes
if(dirent.Attributes &0x01)
p_string[14] = 'R';
if(dirent.Attributes &0x02)
p_string[15] = 'H';
if(dirent.Attributes &0x04)
p_string[16] = 'S';
if(dirent.Attributes &0x08)
p_string[17] = 'V';
if(dirent.Attributes &0x10)
p_string[18] = 'D';
if(dirent.Attributes &0x20)
p_string[19] = 'A';
//
Extract the time and format it into the string
memcpy(&time,
&dirent.Time, sizeof(FATTime));
sprintf(&p_string[21],
"%02d:%02d:%02d", time.hour, time.min, time.sec*2);
//
Extract the date and format it into the string
memcpy(&date,
&dirent.Date, sizeof(FATDate));
sprintf(p_string, "%s
%02d/%02d/%04d %d %d", p_string,
date.month,
date.day, date.year+1980,
dirent.startCluster,
dirent.fileSize);
printf("\n%s", p_string); //
p_string is now ready!
return;
} // end
PrintDirectoryEntry
Programming Resources: Formatting and printing the FAT
The PrintFatTable function shows one method of formatting and printing the FAT table for easy viewing. This function organizes the FAT table into 10 columns. It relies on the helper function GetFatEntry that retrieves a FAT table entry when given an index into the FAT table.
void PrintFatTable(int n)
{
char tbuf[16];
char fbuf[100];
unsigned short fatvalue;
int size = (512 * 9) / 1.5; // Max fat entries in the FAT table
if (n < size) size = n;
for (int i=0; i<size; i++)
{ if
((i%10) == 0)
{ if
(i) printf("%s", fbuf);
sprintf(fbuf,
"\n %6d:",
i);
}
fatvalue =
GetFatEntry(i);
// Special
formatting cases...
if (i < 2)
sprintf(tbuf, " RSRV"); // A reserved cluster
else if
(fatvalue == 4095) sprintf(tbuf, "
EOC"); // The EOC marker
else if
(fatvalue == 4087) sprintf(tbuf, "
BAD"); // The BAD cluster marker
else
sprintf(tbuf,"%5d", fatvalue); "); // Output fat value
strcat(fbuf, tbuf);
}
return;
} // End PrintFatTable
Programming Time
#include <stdio.h>
#include <time.h>
#pragma pack(push,1) //
BYTE align in memory (no padding)
typedef struct
{ //
(total 16 bits--a unsigned short)
unsigned short sec: 5; //
low-order 5 bits are the seconds
unsigned short min: 6; //
next 6 bits are the minutes
unsigned short hour: 5; //
high-order 5 bits are the hour
} FATTime;
#pragma pack(pop) //
End of strict alignment
#pragma pack(push,1) //
BYTE align in memory (no padding)
typedef struct
{ //
(total 16 bits--a unsigned short)
unsigned short day: 5; //
low-order 5 bits are the day
unsigned short month: 4; //
next 4 bits are the month
unsigned short year: 7; //
high-order 7 bits are the year
} FATDate;
#pragma pack(pop) //
End of strict alignment
#pragma pack(push,1) //
BYTE align in memory (no padding)
typedef struct
{ unsigned char name[8]; // File name
unsigned char extension[3]; // Extension
unsigned char attributes; // Holds the attributes code
unsigned char reserved[10]; // Reserved
// unsigned short time;
// Time of last write
// unsigned short date;
// Date of last write
FATTime time;
// Time of last write
FATDate date;
// Date of last write
unsigned short startCluster; //
Pointer to the first cluster of the file.
unsigned long fileSize; //
File size in bytes
} DirEntry;
#pragma pack(pop) //
End of strict alignment
int main()
{
DirEntry dirEntry;
time_t a;
struct tm *b;
FATDate *d;
FATTime *t;
// capture local time and date
time(&a);
b = localtime(&a); //
get local time
d = (FATDate*)&(dirEntry.date); // point to date w/in
dir entry
d->year = b->tm_year + 1900 - 1980; // update year
d->month = b->tm_mon; //
update month
d->day = b->tm_mday; //
update day
t = (FATTime*)&(dirEntry.time); // point to time w/in
dir entry
t->hour = b->tm_hour; //
update hour
t->min = b->tm_min; // update minute
t->sec = b->tm_sec; //
update second
// convert back
struct tm xx, *x = &xx;
x->tm_wday = 0;
x->tm_yday = 0;
x->tm_isdst = 0;
d = (FATDate*)&(dirEntry.date); // point to date w/in
dir entry
x->tm_year = d->year + 1980 - 1900; // update year
x->tm_mon = d->month; //
update month
x->tm_mday = d->day; //
update day
t = (FATTime*)&(dirEntry.time); // point to time w/in
dir entry
x->tm_hour = t->hour; //
update hour
x->tm_min = t->min; //
update minute
x->tm_sec = t->sec; //
update second
char buf[100];
int size;
size = strftime(buf, 99, "%A, %B %d, %Y
%H:%M:%S", x);
printf("\nDate: %s", buf);
printf("\nDate: %s", asctime(x));
return 0;
}
Programming Extras: Microsoft's
White Pages
Microsoft's Hardware White Pages document 34 pages of everything you ever wanted (or didn’t want) to know about FAT. Use this document to learn more about the boot sector, how to determine the type of FAT file system, and long file and directory names.